Misspecification in Linear Spatial Regression Models

Authors

  • Raymond J.G.M. Florax
  • Peter Nijkamp
Abstract

Spatial effects are endemic in models based on spatially referenced data. The increased awareness of the relevance of spatial interactions, spatial externalities, and networking effects among actors gave rise to the field of spatial econometrics. Spatial econometrics focuses on the specification and estimation of regression models that explicitly incorporate such spatial effects. The multidimensionality of spatial effects calls for misspecification tests and estimators that are notably different from techniques designed for the analysis of time series. With that in mind, we introduce the notion of spatial effects, referring to both heterogeneity and interdependence of phenomena occurring in two-dimensional space. Spatial autocorrelation or dependence can be detected by means of cross-correlation statistics in univariate as well as multivariate data settings. We review tools for exploratory spatial data analysis and misspecification tests for spatial effects in linear regression models. A discussion of specification strategies and an overview of available software for spatial regression analysis, including its main functionalities, are intended to give practitioners of spatial data analysis a head start.
* This paper is forthcoming in the Encyclopedia of Social Measurement, San Diego: Academic Press, 2004, edited by K. Kempf-Leonard. The authors would like to thank Henri L.F. de Groot, Thomas de Graaff, and two anonymous reviewers for helpful comments.
1. Spatial effects
The awareness and incorporation of space is evidently fundamental to geography. In 1979, it spurs the geographer Waldo Tobler to formulate the first law of geography, stating “everything is related to everything else, but near things are more related than distant things.” The relevance of spatial effects extends, however, beyond geography, and is ubiquitous in many of the social sciences.
Cliff and Ord (1981) start their seminal book about models and applications of spatial processes with an example from epidemiology. They analyze the spatial mortality pattern from cholera in London, 1848–1849, and attribute the high incidence of the disease in the London metropolis to the high organic content of the Chelsea, Southwark and Vauxhall waters. Similarly, spatial analysis techniques are used to analyze diffusion–adoption patterns of innovations by farmers, to prescribe spatially differentiated fertilizer doses in precision agriculture, to explain network effects among individuals, and to investigate human–environment interactions in processes of environmental degradation.
Economists have traditionally been more reluctant to consider space as a relevant factor. In 1890, the economist Marshall acknowledges the role of space, maintaining that the working of the market depends “... chiefly on variations in the area of space, and the period of time over which the market in question extends; the influence of time being more fundamental than that of space.” It takes until the 1950s, however, before Walter Isard opposes what he calls the “Anglo-Saxon bias” that repudiates the factor of space, compresses everything within the economy to a point so that all spatial resistance disappears, and thus confines economic theory to “a wonderland of no spatial dimensions.” It has been an ongoing debate ever since whether space is merely a geographical facilitator or medium for movement, or whether space has an intrinsic explanatory function. The Dixit-Stiglitz revolution in economic theory increases the awareness of imperfect competition and increasing returns to scale, which is subsequently apparent in the New Economic Geography of Krugman, Fujita, and others.
Nowadays, spatial dimensions are taken into account in the study of, for instance, economic growth, high-tech innovations, urban economics, public sector productivity, fiscal policy interdependence, and international trade.
1.1 Spatial dependence and spatial heterogeneity
Spatial effects include spatial heterogeneity and spatial dependence. Spatial heterogeneity refers to structural relations that vary over space, either in a discrete or categorical fashion (for instance, urban vs. rural, or according to an urban hierarchy), or in a continuous manner (such as on a trend surface). Spatial dependence points to systematic spatial variation that results in observable clusters or a systematic spatial pattern. These descriptions already show that, in an observational sense, spatial dependence and spatial heterogeneity are not always easily discernible. The clustering of high values in, for instance, urban areas and urban fringes can be interpreted as spatial clustering of high values pertaining to urban areas and low values to rural areas, but it may as well be viewed as spatial heterogeneity distinguishing metropolitan areas from their hinterland.
The typical feature of spatial dependence or spatial autocorrelation is that it is two-dimensional and multidirectional. An observation of an attribute at one location can be correlated with the value of the same attribute at a different location, and vice versa, and the causation pattern can occur in different directions. Figure 1 shows two identical (7×10) regular grid systems with distinct spatial distributions of the same values. The absolute location of the nonzero values is the same in both grids, but graph (a) shows a clustering of relatively low values on the left-hand side, and high values to the right. Graph (b) shows a much more random spatial allocation of values.
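The contrast between a clustered and a random allocation of the same values can be illustrated numerically. The sketch below computes Moran's I, a standard measure of spatial autocorrelation, for two hypothetical arrangements on a 7×10 grid. The values, the rook-contiguity weights (cells sharing an edge count as neighbors), and the helper names `rook_weights` and `morans_i` are assumptions made for illustration; they are not the data of Figure 1.

```python
import numpy as np


def rook_weights(n_rows, n_cols):
    """Binary rook-contiguity matrix for a regular grid (cells sharing an edge)."""
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if r + 1 < n_rows:  # neighbor below
                j = (r + 1) * n_cols + c
                W[i, j] = W[j, i] = 1.0
            if c + 1 < n_cols:  # neighbor to the right
                j = r * n_cols + (c + 1)
                W[i, j] = W[j, i] = 1.0
    return W


def morans_i(x, W):
    """Moran's I = (n / S0) * (z'Wz) / (z'z), with z the deviations from the mean
    and S0 the sum of all weights."""
    z = np.asarray(x, dtype=float).ravel()
    z = z - z.mean()
    return (z.size / W.sum()) * (z @ W @ z) / (z @ z)


n_rows, n_cols = 7, 10
W = rook_weights(n_rows, n_cols)

# Hypothetical values (not those of Figure 1): low on the left, high on the right.
clustered = np.tile(np.arange(n_cols, dtype=float), (n_rows, 1))

# The same values, randomly reallocated over the grid.
rng = np.random.default_rng(0)
scattered = rng.permutation(clustered.ravel()).reshape(n_rows, n_cols)

i_clustered = morans_i(clustered, W)
i_scattered = morans_i(scattered, W)
print(f"clustered: {i_clustered:.3f}, scattered: {i_scattered:.3f}")
```

Under these assumptions the clustered arrangement yields a strongly positive I, whereas the random reallocation should give a value close to -1/(n-1), the expectation of I when there is no spatial autocorrelation.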
In terms of spatial effects, we note that the distribution in graph (a) exhibits spatial dependence and spatial heterogeneity, whereas graph (b) does not. The occurrence of spatial heterogeneity does not necessarily have severe implications for the information that can be obtained from a spatial data series. Spatial autocorrelation does, however, because an observation is partly predictable from neighboring observations. A series of spatially dependent observations therefore contains less information. This is similar to the situation in time series analysis, where a forecast with respect to the future can be partly inferred from what happened in the past. The two-dimensional and multidirectional nature of spatial autocorrelation makes the spatial case more complex.
1.2 Spatial econometrics
The history of the analysis of the spatial autocorrelation problem goes back to the work of statisticians such as Moran, Geary and Whittle in the late 1940s and early 1950s. The development of the literature is rather slow until Cliff and Ord publish their seminal book about spatial autocorrelation in 1973. Their book focuses on the statistical analysis of spatial data series, although not exclusively from a spatial statistical point of view because there is also some attention to modeling. The modeling context is, however, much more pronounced in the efforts of the Dutch–Belgian economist Jean Paelinck, who coins the term ‘spatial econometrics’ in the early 1970s. Paelinck and Klaassen jointly write the first monograph on spatial econometrics in 1979, stressing the need to explicitly model spatial relations, epitomizing the asymmetry in spatial interrelations and the role of spatial interdependence. In those days the frontier of the field is pushed ahead mainly by Dutch regional economists (Bartels, Brandsma, Hordijk, Ketellapper, and Nijkamp).
Later, the center of activity shifts to the US, where both economists and geographers concentrate on introducing new statistical tests (particularly, tests developed in a maximum likelihood framework) and on the specification and estimation of spatial regression models. During the late 1990s, these methodological developments are gradually being picked up in applied spatial research, facilitated among other things by the availability of spatial econometric software. The most comprehensive book on ‘modern’ developments is Anselin’s 1988 book on methods and models of spatial econometrics, where he defines spatial econometrics as “the collection of techniques that deal with the peculiarities caused by space in the statistical analysis of regional science models.” The modeling perspective, already pondered by Paelinck, distinguishes spatial econometrics from the broader field of spatial statistics (Cressie, 1993). Good theoretical overviews of spatial econometrics are available in Anselin and Bera (1998), and Anselin (2001). The tutorial of the SpaceStat software is available online (http://www.terraseer.com/spacestat.html) and provides a good introduction. LeSage’s contribution to the Web book of regional science gives a more applied, hands-on introduction to spatial econometrics (http://www.rri.wvu.edu).
1.3 Article organization
This article continues as follows. In Section 2, we discuss the test statistics that have been developed for spatial autocorrelation, and the measurement level of the data to which each applies. We cover various techniques for exploratory spatial data analysis, in particular those techniques that assist in determining the correct specification of regression models. Several frequently used models are introduced in Section 3, and in Section 4 we present misspecification tests for spatial dependence.
In practical applications it is often difficult to determine the actual data generating process on the basis of a series of misspecification tests. Little is known about this specific aspect of spatial econometric modeling. We review the limited knowledge about specification search strategies in Section 5. At the same time, we provide guidelines for practitioners of spatial data analysis by reviewing software that contains tools for misspecification testing and estimation of spatial process models. Section 6 concludes.
2. Testing for spatial autocorrelation
2.1 Definition of spatial autocorrelation
The seminal work of Cliff and Ord (1973, 1981) has drawn extensive attention to the statistical properties of spatial data, in particular to spatial autocorrelation or dependence. Statistical tests for spatial association or dependence are always based on the null hypothesis of spatial independence. The general notion of independence is easily formalized as:


Publication date: 2003